DATABASE BUILDING METHOD FOR MULTIMEDIA CONTENTS 

This application claims the benefit under 35 U.S.C. § 119(e)(1) of and 
incorporates by reference U.S. Provisional Application No. 60/207,969 filed 
on May 31, 2000. This application also incorporates by reference Korean 
Patent Application No. 00-54868 filed on September 19, 2000. 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates to classification of multimedia data, and 
more particularly, to a database building method for multimedia data 
(hereinafter, referred to as multimedia contents) in which multimedia contents 
are semantically classified and stored in a predetermined database. 

2. Description of the Related Art 

A great deal of multimedia content is commonly used on the World Wide Web 
(WWW). However, existing retrieval methods mainly retrieve text data, and 
fast and efficient methods for retrieving images, audio data, and motion 
video data with voice have not been introduced. 

As the amount of multimedia data increases, a database building method for 
multimedia contents and a method for providing retrieval services to users 
using the built database are required. 

SUMMARY OF THE INVENTION 

To solve the above problems, it is an object of the present invention to 
provide a database building method for multimedia contents in which 
multimedia contents dispersed on the World Wide Web or other 
telecommunications networks are efficiently collected and stored in one 
database so that fast retrieval of multimedia contents is enabled. 

It is another object to provide a database building apparatus for 
multimedia contents, using the database building method for multimedia 
contents. 

It is another object to provide a multimedia contents retrieval method 
for quickly retrieving multimedia contents from the database built by the 
database building method for multimedia contents. 

It is another object to provide a multimedia contents retrieval apparatus 
using the retrieval method for multimedia contents. 

To accomplish the above object of the present invention, there is 
provided a database building method for multimedia contents, the method 
including the steps of (a) accessing an arbitrary site providing multimedia 
contents through a telecommunications network; (b) calling in multimedia 
contents by spidering the site; and (c) classifying the multimedia contents 
data according to the stored addresses and storing them in a predetermined 
database. 

Also, the multimedia contents data can be image data. 

It is preferable that the addresses are uniform resource locators 
(URLs). 

It is preferable that the arbitrary site is either a retrieval site or a 
portal site. 

It is preferable that step (b) further includes the sub-steps of (b-1) 
inputting a search word; (b-2) parsing texts corresponding to the file names of 
multimedia contents or texts corresponding to sub-categories in hyper text 
markup language (HTML) web page data having the retrieved results for the 
input search word; and (b-3) calling multimedia contents data having 
addresses corresponding to the parsed texts. 

It is preferable that before step (b-3) the method further includes 
(p-b-3-1) visiting the corresponding category when the texts corresponding 
to the sub-category are parsed in the loaded HTML web page data. 

It is preferable that in step (b-2), keywords representing the 
characteristics of the texts together with the texts corresponding to the 
sub-categories and the texts corresponding to the file names of the multimedia 
contents are parsed in the loaded HTML web page data. 

It is preferable that after step (b-3) the method further includes the step 
of (b-4) filtering noise images out of the called images. 

It is preferable that step (b-4) further includes the sub-steps of (b-4-1) 
determining whether or not the pixel number of a called image is equal to or 
greater than a predetermined threshold value; and (b-4-2) when the pixel 
number of a called image is equal to or greater than the predetermined 
threshold value, indexing the corresponding image. 

It is preferable that the threshold value is 128. 

It is preferable that step (c) further includes the sub-steps of (c-1) 
decreasing the resolution of the called image; and (c-2) storing the image 
whose resolution was decreased in a predetermined database according to the 
categorized structure. 

Alternatively, it is preferable that in step (c), the URL of the web page 
storing the called multimedia contents data is stored in a predetermined 
database using the URL information. 

Alternatively, it is preferable that in step (c), at least one of URL 
information or keyword information together with information on respective 
images is stored in respective predetermined databases so that keywords can 
be linked to individual images. 

To accomplish another object of the present invention, there is also 
provided a database building method for multimedia contents, the method 
including the steps of (a) accessing an arbitrary site providing multimedia 
contents using a database having a categorized structure; (b) calling 
multimedia contents data by spidering the site; and (c) storing the called 
multimedia contents data to a predetermined database, using the categorized 
structure. 

To accomplish another object of the present invention, there is also 
provided a database building apparatus for multimedia contents, having a web 
visitor for accessing an arbitrary site providing multimedia contents and 
calling multimedia contents by spidering the site; and a database for 
classifying and storing the called multimedia contents data, using the 
categorized structure of the database of the site or the addresses storing the 
called multimedia contents data. 

To accomplish another object of the present invention, there is also 
provided a retrieval method for multimedia contents, the method including the 
steps of (a) receiving keywords corresponding to query images to be searched 
for from a user; and (b) retrieving images corresponding to the keywords in a 
predetermined database storing keywords corresponding to individual images 
together with a plurality of images. 

To accomplish another object of the present invention, there is also 
provided a retrieval apparatus for multimedia contents having a database 
storing a plurality of images and keywords corresponding to the individual 
images; and a retrieval unit for receiving keywords corresponding to the query 
data from the user, and retrieving multimedia contents data corresponding to 
the keywords in the database. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The above objects and advantages of the present invention will become 
more apparent by describing in detail a preferred embodiment thereof with 
reference to the attached drawings in which: 

FIG. 1 is a block diagram showing the structure of a database building 
apparatus for multimedia contents according to an embodiment of the present 
invention; 

FIG. 2 is a flowchart showing the major steps of a database building 
method for multimedia contents according to an embodiment of the present 
invention used in the apparatus of FIG. 1; 

FIG. 3 is a flowchart showing the major steps of a database building 
method for multimedia contents according to another embodiment of the 
present invention used in the apparatus of FIG. 1; 

FIG. 4 is a block diagram showing the structure of a multimedia 
contents retrieval apparatus according to an embodiment of the present 
invention; and 

FIG. 5 is a flowchart showing the major steps of a multimedia contents 
retrieval method according to an embodiment of the present invention used in 
the multimedia contents retrieval apparatus of FIG. 4. 

DETAILED DESCRIPTION OF THE INVENTION 

Hereinafter, embodiments of the present invention will be described in 
detail with reference to the attached drawings. The present invention is not 
restricted to the following embodiments, and many variations are possible 
within the spirit and scope of the present invention. The embodiments of the 
present invention are provided in order to more completely explain the present 
invention to anyone skilled in the art. 

According to the present invention, multimedia contents are 
semantically classified so that retrieval or browsing can be done efficiently. 
For example, multimedia contents corresponding to "F-16 fighter" can be 
classified in a category referred to as "Gulf War". For this, the merit of the 
categorized structure of a retrieval site is used. For example, retrieval sites 
such as Yahoo™ have a categorized structure. For example, when a text 
categorized as "movie" is clicked on, collected information on more detailed 
sites related to movies is provided in text form under categories such as 
"erotic", "action", or "human episode". Also, the addresses of detailed sites 
related to respective movies can be provided. The classification of such 
retrieval sites and portal sites is semantically well done. Therefore, the 
present invention uses the categorized structures of such retrieval sites and 
portal sites in making a database for multimedia contents. 

FIG. 1 is a block diagram showing a database building apparatus for 
multimedia contents according to an embodiment of the present invention. 
FIG. 2 is a flowchart showing the major steps of a database building method 
for multimedia contents according to an embodiment of the present invention 
used in the apparatus of FIG. 1. FIG. 2 will be frequently referred to in the 
following explanation. 

For the present embodiment, an image is taken as an example of the 
multimedia contents. Referring to FIG. 1, the database building apparatus 10 
for multimedia contents according to an embodiment of the present invention 
is connected to the World Wide Web (WWW) 12, and has a web visitor 100, a 
parser 102, a filtering unit 104, a resolution decreasing unit 106, an image 
database 108, a category database 110, a keyword database 114, a uniform 
resource locator (URL) database 112, and a control unit 120. 

The operation of the database building apparatus for multimedia 
contents will now be explained. First, a user selects and visits an arbitrary 
retrieval site in step 202, and, on the visited home page, clicks on the text 
of a category corresponding to the field in which the user is interested, 
which consequently becomes the object of the database to be built, in step 
204. The contents classification of the retrieval site has a categorized 
structure. Responding to the click by the user, the web visitor 100 loads the 
hyper text markup language (HTML) web page data mapped from the text in step 
206. Next, the parser 102 parses texts corresponding to sub-categories, or 
texts corresponding to multimedia contents, that is, texts corresponding to 
file names of images (in the present embodiment, for example, texts with 
extensions of ".JPG", ".GIF", or ".BMP"), in step 208. Next, it is determined 
whether or not the parsed text corresponds to a sub-category in step 210. When 
it is determined that the parsed text corresponds to a sub-category, the 
sub-category is visited in step 212 and step 206 is carried out again. 
Meanwhile, when texts corresponding to the file names of images in the loaded 
HTML web page data are parsed, the images having the file names corresponding 
to the parsed texts are called in step 214. By doing so, the web visitor 100 
hierarchically visits web pages in the retrieval site and calls images. Such 
operations are automatically executed, and a means referred to as a web robot 
can be used to implement the operations. That is, it can be said that the web 
robot visits sites related to the selected URL by spidering the selected URL 
and its offspring URLs. 
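
The parsing of steps 208 and 210 can be sketched as follows. This is only an illustrative sketch, not the implementation of the parser 102: the class name CategoryPageParser, the function parse_page, and the rule that any non-image link counts as a sub-category are assumptions made for the example.

```python
from html.parser import HTMLParser

# Extensions named in the embodiment; matching is case-insensitive here (assumption).
IMAGE_EXTENSIONS = (".jpg", ".gif", ".bmp")


class CategoryPageParser(HTMLParser):
    """Collects sub-category links and image file names from one HTML page."""

    def __init__(self):
        super().__init__()
        self.sub_categories = []  # link targets that lead to further category pages
        self.image_names = []     # link or image targets that name image files

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a":
            target = attr_map.get("href")
        elif tag == "img":
            target = attr_map.get("src")
        else:
            return
        if not target:
            return
        if target.lower().endswith(IMAGE_EXTENSIONS):
            self.image_names.append(target)      # to be called in step 214
        elif tag == "a":
            self.sub_categories.append(target)   # to be visited in step 212


def parse_page(html_text):
    """Return (sub_categories, image_names) parsed from html_text."""
    parser = CategoryPageParser()
    parser.feed(html_text)
    return parser.sub_categories, parser.image_names
```

A web robot built on such a parser would visit each returned sub-category in turn and call each returned image, which corresponds to the loop through steps 206 to 214.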

Also, it is preferable that the parser 102 parses keywords showing the 
characteristics of the texts as well as the texts corresponding to the file 
names of the images in the step 206. Since keywords are nouns in general, it 
is possible to extract them using already known methods. 

Meanwhile, graphics and the like for decorating web sites among the 
called images are regarded as noise and excluded from indexing. Therefore, the 
called images are filtered and then indexed. In the present embodiment, the 
filtering unit 104 determines whether or not the number of pixels of a called 
image is equal to or greater than 128 in step 216. When the pixel number of 
the called image is less than 128, the called image is determined to be a 
thumbnail, and is filtered out and not indexed in step 218. When the pixel 
number of the called image is equal to or greater than 128, the called image is 
determined not to be a thumbnail, and the resolution decreasing unit 106 
decreases the resolution of the image in step 220. 
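
Steps 216 to 220 can be illustrated with the short sketch below. It is a sketch under stated assumptions: the text does not say how the pixel number is measured or how the resolution is decreased, so the pixel number is taken as a given integer and the resolution is decreased by simple decimation. The function names are hypothetical.

```python
PIXEL_THRESHOLD = 128  # threshold value given in the embodiment


def is_noise(pixel_number, threshold=PIXEL_THRESHOLD):
    """True when the image falls below the threshold and is not to be indexed."""
    return pixel_number < threshold


def decrease_resolution(pixels, factor=2):
    """Keep every factor-th pixel in each direction (decimation, an assumption)."""
    return [row[::factor] for row in pixels[::factor]]


def filter_and_shrink(images, factor=2):
    """images: list of (name, pixel_number, pixel_rows) tuples.

    Returns a dict mapping the names of indexed images to reduced pixel rows.
    """
    indexed = {}
    for name, pixel_number, pixels in images:
        if is_noise(pixel_number):
            continue  # step 218: filtered out and not indexed
        indexed[name] = decrease_resolution(pixels, factor)  # step 220
    return indexed
```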

The image whose resolution has been decreased is stored in the image 
database 108, and the identification information of the image stored in the 
image database 108 and the category information of the visited web page data 
are stored in the category database 110 in step 222. 

Alternatively, the original data can be stored in the database without 
decreasing its resolution, or, instead of storing the called image in the 
database, the URL of the web page having the image can be stored so that the 
corresponding site can be linked. Also, preferably, in order for keywords to be 
linked to respective images, keywords corresponding to respective images can 
be stored in the keyword database 114 together with the information on the 
respective images stored in the image database. 

The control unit 120 determines whether or not the number of indexed 
images is equal to or greater than 1,000 in step 224. When the number of 
indexed images is less than 1,000, a control signal of a "low" level is output, 
and when the number is equal to or greater than 1,000, a control signal of a 
"high" level is output. Responding to the "low" level control signal, the 
parser 102 performs step 208, and responding to the "high" level control 
signal, it finishes parsing. That is, when the number of indexed images is 
equal to or greater than 1,000, the visit of a site is finished. 
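
The hierarchical visit with its stopping condition might be sketched as below. The site is modeled as an in-memory dictionary rather than fetched over a network, and the function name build_category_database and the breadth-first visiting order are assumptions made for the example, not details taken from the embodiment.

```python
from collections import deque


def build_category_database(site, root, max_images=1000):
    """Visit categories starting at root and record each image's category.

    site: dict mapping a category path to (sub_category_paths, image_names).
    Returns a dict mapping image name to the category path it was found under.
    """
    category_db = {}
    queue = deque([root])
    visited = set()
    while queue:
        category = queue.popleft()
        if category in visited:
            continue
        visited.add(category)
        sub_categories, image_names = site.get(category, ([], []))
        for name in image_names:
            category_db[name] = category
            if len(category_db) >= max_images:
                return category_db  # step 224: enough images, finish the visit
        queue.extend(sub_categories)
    return category_db
```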

In the database building method for multimedia contents according to 
the embodiment of the present invention, multimedia contents in the 
hierarchically visited categories, for example, thumbnail images whose 
resolution has been decreased, or original images, are semantically classified 
and stored in the corresponding database using category information of the 
corresponding sites. 

Also, in the database building method for multimedia contents 
according to the present invention, URLs are used and the directory structures 
of the sites on the WWW are considered. For example, retrieval sites such as 
Google™ or Altavista™ provide retrieval based on URLs rather than 
category information. For example, when a search word "soccer" is input, the 
addresses of sites related to "soccer" are provided as the search results. Even 
when these retrieval sites are used, sites having semantically close relations 
with the corresponding search word are provided. 

In the database building method for multimedia contents according to 
another embodiment of the present invention, a structure that enables a 
semantic search of these retrieval sites is used for building a database for 
multimedia contents. FIG. 3 is a flowchart showing the major steps of a 
database building method for multimedia contents according to another 
embodiment of the present invention used in the apparatus of FIG. 1. 
Referring to FIG. 3, in the database building method for multimedia contents 
according to another embodiment of the present invention, first, the web 
visitor 100 selects and visits an arbitrary retrieval site in step 302. 
Next, the user inputs a search word corresponding to the field of the database 
to be built in step 304. The search word corresponds to the identifier of the 
multimedia contents to be included in the database. Next, the web visitor 100 
receives the addresses of sites related to the input search word, for example, 
HTML web page data having URL information, in step 306. 

Next, the parser 102 parses the addresses of the sites in the received 
HTML web page data in step 308. The web visitor 100 hierarchically visits 
sites corresponding to the parsed addresses in step 310. Then, the web visitor 
100 loads root HTML web page data from the visited site in step 312. 
The parser 102 parses multimedia contents in the loaded HTML web page data 
(in the present embodiment, for example, texts corresponding to the names of 
images, such as texts having extensions of ".JPG", ".GIF", or ".BMP") in 
step 314. Alternatively, an ALT tag, which is used in the HTML language, can 
be used. Since these image names or ALT tags are manually input by a web 
site author, the characteristics of images, and more generally the 
characteristics of multimedia contents, are relatively well expressed. 

Preferably, the parser 102 also parses keywords representing the 
characteristics of the parsed texts in step 314. Because keywords are generally 
nouns, it is possible to extract them by an already known method. 
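
As an illustration of how image names and ALT text could yield keyword candidates, consider the sketch below. It is an assumption-laden example: it simply splits the file name and the alt attribute into lower-case words and does not perform the noun extraction mentioned above, which the text leaves to already known methods. The class name ImageTextParser is hypothetical.

```python
from html.parser import HTMLParser
import posixpath


class ImageTextParser(HTMLParser):
    """Collects (src, candidate keywords) for every <img> element on a page."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src")
        if not src:
            return
        # File name without directory and extension, e.g. "red_shoe".
        stem = posixpath.splitext(posixpath.basename(src))[0]
        alt = attr_map.get("alt") or ""
        text = (stem + " " + alt).replace("_", " ").replace("-", " ")
        # Lower-case, split on whitespace, and drop repeats while keeping order.
        words = list(dict.fromkeys(word.lower() for word in text.split()))
        self.records.append((src, words))
```

Each record pairs an image address with words that could be stored in the keyword database 114 after further filtering.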

Next, the web visitor 100 calls image data corresponding to the parsed 
text in step 316. Meanwhile, graphics for decorating web sites among the 
called image data are regarded as noise and must be excluded from indexing. 
Therefore, the filtering unit 104 filters the called images, filtering noise 
images out. In the present embodiment, the filtering unit 104 determines 
whether or not the pixel number of the called image is equal to or greater than 
128 in step 318. When the pixel number of the called image is less than 128, 
the image is determined to be a thumbnail and is filtered out to exclude it 
from indexing in step 320. When the pixel number of the called image is equal 
to or greater than 128, the resolution decreasing unit 106 determines that the 
called image is not a thumbnail image and decreases the resolution of 
the image in step 322. The image whose resolution has been decreased is stored 
in the image database 108, and information on respective images stored in the 
image database 108 together with URL information of the visited web page 
data is stored in the URL database 112 in step 324. 

Alternatively, the original data can be stored in the image database 
108 (without decreasing the resolution), and by storing the URL of the web 
page storing the image, instead of storing the called image in the database, the 
corresponding site can be linked. Preferably, keywords corresponding to 
respective images together with information on respective images stored in the 
image database 108 are stored in the keyword database 114. 

The control unit 120 determines whether or not the number of indexed 
images is equal to or greater than a predetermined number, 1,000 in the 
present embodiment, in step 326. When the number of indexed images is less 
than 1,000, the web visitor 100 loads root HTML web page data from the visited 
site according to the step 310. When the number of indexed images is equal to 
or greater than 1,000, the visit of the site is finished. 

Meanwhile, in order to efficiently retrieve images, the characteristics 
of textures and/or colors can be extracted and stored in a separate 
characteristic database (not shown in drawings). These characteristics can be 
extracted by Gabor filters which have scale and directional coefficients. For 
example, when a characteristic vector of an input image is calculated by a 
filter formed by a combination of Gabor filters having 3 kinds of scale 
coefficients and 4 kinds of directional coefficients, and if average 
distributions are used for components of the characteristic vector, the 
characteristic vector can be expressed as shown in equation 1 below: 

f_{texture} = [ t_1, t_2, ..., t_{24} ]   (1) 

Using the characteristic vectors, images are indexed. In the characteristic 
database, the characteristic vectors and image information corresponding to 
the characteristic vectors are stored. 

Similarly, it is possible to extract color characteristics to store in a 
separate characteristic database. Characteristic vectors showing color 
primitives can be extracted from a color distribution histogram calculated in a 
CIE LUV color space. For example, if each dimension of the 3-dimensional color 
space is quantized in four levels, the result can be expressed as a 
64-dimensional color characteristic vector as shown in equation 2 below: 

f_{color} = [ c_1, c_2, ..., c_{64} ]   (2) 

In the characteristic database, the characteristic vectors and image information 
corresponding to the characteristic vectors are stored. 
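
The quantization behind the 64-dimensional color vector can be sketched as follows. This is a minimal sketch, assuming the three color channels have already been converted and scaled to the range [0, 1); the conversion to the CIE LUV color space is not shown, and the normalization of the histogram to sum to 1 is also an assumption.

```python
def color_feature(pixels, levels=4):
    """Return a normalized color histogram with levels**3 bins (64 by default).

    pixels: iterable of (c1, c2, c3) tuples, each channel scaled to [0, 1).
    """
    histogram = [0.0] * (levels ** 3)
    count = 0
    for channels in pixels:
        # Quantize each channel into `levels` steps, clamping values at 1.0.
        q1, q2, q3 = (min(int(c * levels), levels - 1) for c in channels)
        histogram[(q1 * levels + q2) * levels + q3] += 1.0
        count += 1
    if count:
        histogram = [value / count for value in histogram]
    return histogram
```

With four levels per channel there are 4 x 4 x 4 = 64 bins, which matches the length of the vector in equation 2.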

In the database building method for multimedia contents according to 
another embodiment of the present invention, thumbnail images whose 
resolution has been decreased, or original images, both of which are called 
from visited sites, are stored in the corresponding database after being 
classified semantically using URL information of the corresponding sites. The 
characteristics of textures and/or colors of called images are stored in a 
separate database. 

In the database building method for multimedia contents according to the 
present invention, multimedia contents on the WWW are semantically 
classified and indexed. Such a database building method for multimedia 
contents can be applied to multimedia contents such as TV news broadcasts 
or to shopping items using online multimedia expression. 

Though building a database of images is exemplified in the above 
embodiments, the present invention can be applied to various multimedia 
contents such as voice clips and motion video clips with voice. That is, the 
present invention is not restricted to the above-described embodiments, and 
the scope of the present invention is determined by the accompanying claims. 

In the database built by the database building method for multimedia 
contents according to the present invention described above, multimedia 
contents dispersed on the WWW are well collected, and the multimedia 
contents are semantically well classified, using category information or URL 
information. Therefore, various retrieval methods for multimedia can be used 
to efficiently retrieve wanted multimedia contents. Data which is similar to 
query data of multimedia data can be efficiently retrieved, particularly when 
using the method for retrieving multimedia contents according to the present 
invention. 

FIG. 4 is a block diagram showing the structure of a multimedia 
contents retrieval apparatus according to an embodiment of the present 
invention. Referring to FIG. 4, the multimedia contents retrieval apparatus 
according to an embodiment of the present invention is linked to a server 44 
for providing an image retrieval service through the WWW 42, a kind of 
service provided through the Internet. 

The multimedia contents retrieval apparatus has a keyword retrieval 
unit 402, a display image selecting unit 404, an image display unit 406, an 
image retrieval unit 408, a user interface 410, and a web server 412 for 
communicating with the WWW 42. 

The server 44 has databases built by the database building method for 
multimedia contents explained referring to FIGS. 2 and 3, that is, an image 
database 440, a category database 442, a URL database 444, and a keyword 
database 446. Also, the server 44 has a web server 448 for communicating 
with the WWW 42. 

FIG. 5 is a flowchart showing the major steps of a multimedia contents 
retrieval method according to an embodiment of the present invention used in 
the multimedia contents retrieval apparatus of FIG. 4. FIG. 5 is referred to 
from time to time. In the present embodiment, an image is taken as an 
example of the multimedia contents, and it is assumed that databases are built 
using the database building method for multimedia contents according to the 
embodiment of the present invention explained referring to FIG. 2. 

Referring to FIG. 5, first, a keyword corresponding to a query image 
is received from the user in step 502. For example, when a user wants to 
retrieve a "shoe" having a certain shape with a query image, the user runs, in 
a computer, a recording medium storing program codes that perform the 
multimedia contents retrieval method according to the present invention, 
and inputs the keyword "shoe" into a retrieval keyword field on the operating 
screen displayed on the user's monitor. 

Next, the keyword retrieval unit 402 retrieves words, which are 
identical to the input keyword, in the keyword database 446 of the server 44 
through the web server 412. When the identical word is retrieved, the image 
linked to the retrieved word is called in from the image database 440. By 
doing so, images corresponding to the input keyword are retrieved in step 504. 
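
The keyword lookup of step 504 amounts to following links from a keyword to image identifiers and then to images. A minimal sketch follows, assuming both databases are plain dictionaries; the function name and the data layout are hypothetical and chosen only for illustration.

```python
def retrieve_by_keyword(keyword, keyword_db, image_db):
    """Return the images linked to a word identical to the input keyword.

    keyword_db: dict mapping a keyword to a list of image identifiers.
    image_db: dict mapping an image identifier to the stored image record.
    """
    image_ids = keyword_db.get(keyword, [])
    # Skip identifiers that have no stored image instead of failing.
    return [image_db[image_id] for image_id in image_ids if image_id in image_db]
```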

Meanwhile, since there are a lot of images in the database, and the 
images retrieved from a voluminous database using only a keyword could 
include images which are not visually similar to the wanted image, it is 
almost impossible to retrieve the wanted image with one retrieval using 
only a keyword. Therefore, it is preferable that the user visually checks 
some of the retrieved images and selects similar images, and that the 
selected images are fed back to the image retrieval unit 408 so that retrieval 
can be executed again. 

For this, the display image selecting unit 404 selects a predetermined 
number of images among the images retrieved in the step 504 and the image 
display unit 406 displays the predetermined number of selected images for the 
user in step 506. 

Next, viewing the displayed images, the user selects one or more images 
which are similar to the image the user wants to find, designates those 
images as query images, and provides information on them. In the present 
embodiment, responding to the user's input, the user interface 410 
selects a plurality of shoe shape images and provides selection information. 
By doing so, the image retrieval unit 408 receives information on candidate 
query images, which are decided to be visually similar to the wanted image, 
from the user in step 508. 

Next, in step 510, the image retrieval unit 408 retrieves, in the image 
database, images which are similar in at least one of the color 
characteristic, the texture characteristic, and the shape to the candidate 
query images that are determined to be visually similar to the wanted image. 

In order to determine whether or not two images, that is, the query 
image and the retrieved image, are visually similar, similarity can be obtained 
from the calculated difference of the characteristic vectors of the two images. 
In the present embodiment, it is assumed that the characteristic vectors of 
images are stored in a characteristic database (not shown in drawings). When k 
is the length of the texture vector, the difference between characteristics of 
textures of two images i and j can be obtained by the following equation 1: 

[equation not recoverable from the source text]   ...(1) 

Also, when k is the length of the color vector, the difference between 
characteristics of colors of two images i and j can be obtained by calculating 
the Euclidean distance of the two characteristic vectors using equation 2 
below: 

d_{color}(i, j) = \left[ \sum_{n=1}^{k} \left( c_n^{(i)} - c_n^{(j)} \right)^2 \right]^{1/2}   ...(2) 



The retrieved image is determined to be the image which has the 
characteristic vector of the least difference to the characteristic vector of the 
given query image. 
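
The least-difference rule can be sketched with the Euclidean distance described for the color vectors. This is only an illustrative sketch; the function names and the dictionary used as the characteristic database are assumptions.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two characteristic vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def most_similar(query_vector, feature_db):
    """Return the identifier whose vector has the least difference to the query.

    feature_db: dict mapping an image identifier to its characteristic vector.
    """
    return min(
        feature_db,
        key=lambda image_id: euclidean_distance(query_vector, feature_db[image_id]),
    )
```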

When an image to be retrieved is an original image, the retrieved 
image is provided to the user as it is. When an image to be retrieved is a 
thumbnail image, the URL of the retrieved image, that is, the URL 
corresponding to the original image of the thumbnail image is used to call the 
original image after the site having the corresponding URL is connected 
through the Internet. The original image is then provided to the user. At this 
time, the URL information can be stored together with the thumbnail image in 
the image database 440. 

In retrieving based on contents, the user selects a set R of relevant 
query images. The relative weighted values of characteristics of colors and 
textures are determined depending on how tightly such sets of images are 
clustered in a color space. That is, when | R | is the number of images in the 
query set, the weighted values are obtained by equations 3 and 4 below: 

[equation not recoverable from the source text]   ...(3) 

[equation not recoverable from the source text]   ...(4) 

Next, when \epsilon is a predetermined small value for preventing any one 
characteristic from being extremely prominent, the weighted values can be 
obtained through the following equations 5 and 6: 

w_{texture} = \frac{1}{d_{texture} + \epsilon}   ...(5) 

w_{color} = \frac{1}{d_{color} + \epsilon}   ...(6) 

When N is a predetermined positive number, N nearest neighbors can 
be obtained by calculating equation 7 below: 

d(i, q) = w_{texture} d_{texture}(i, q) + w_{color} d_{color}(i, q)   ...(7) 
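
The weighting and the nearest-neighbor selection can be sketched as follows. Several assumptions are made and should be read as such: the spread values d_texture and d_color of equations 3 and 4 are taken as given inputs because those equations are not legible in the source, the Euclidean distance is used for both characteristics although the text states it only for color, and all function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def weights(d_texture, d_color, epsilon=1e-3):
    """Inverse of each spread plus a small epsilon, as in equations 5 and 6."""
    return 1.0 / (d_texture + epsilon), 1.0 / (d_color + epsilon)


def nearest_neighbors(query, database, w_texture, w_color, n):
    """Return the n identifiers with the smallest weighted distance (equation 7).

    query: (texture_vector, color_vector) pair.
    database: dict mapping an identifier to a (texture_vector, color_vector) pair.
    """
    def combined(image_id):
        texture, color = database[image_id]
        return (w_texture * euclidean(query[0], texture)
                + w_color * euclidean(query[1], color))

    return sorted(database, key=combined)[:n]
```

A characteristic in which the selected query images lie close together gets a small spread and therefore a large weight, so it dominates the combined distance.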

Generally, a query is specified by a single pair of a texture 
10 characteristic vector and a color characteristic vector. Therefore, in the 
present embodiment, when a plurality of query images are selected, the 
average of the characteristic vector and the color characteristic vector is used. 
That is, the values are obtained by equations 8 and 9 below: 



J texture _ J tex 



) 

, texture 

R q \ ,*R 

...(8) 



color 

15 ...(9) 



f = -^y f (,) 

J color | Lmd J coi 



. _ . color 
R\ imR 
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Retrieval based on contents can be generalized as follows. In a single 
query image using characteristic vectors f tex ture and f co ion first, when i is 
1 V ..,M2 and it is assumed that following conditions 10 and 11 are 
satisfied: 

d if s {i) )<d (f s U) ) 

"texture \J texture ' ^texture / ~~ texture \J texture ? ^texture } 

5 .-(10) 

(Here, xcS 

d if s° Kr:2) ) <d (f x U) ) 

M texture \J texture ' ^texture ) ~ texture\J texture > texture/ 

Then, the following equation 12 can be used: 

Stexture 

{^}...(12) 

10 Second, when i is l,...JV/2 and i^, it is assumed that following 

conditions 13 and 14 are satisfied: 

d if v (,) )< d if s U) ) 

color \J color ' color J ~ color \J color > color / 

...(13) 

(Here, xcS color ) 
d if v (V 2) )< d (f x u) ) 

w color \J color ' ^ color / — "color \J color ' color / 

...(14) 

15 Then, the following equation 15 can be used: 

Scalar = {s ( °} -(15) 

Also, for a plurality of query images having f_{texture} and f_{color}, when
i is 1,...,N and i < j, it is assumed that the following conditions 16 and 17
are satisfied:

d(f, f^{s(i)}) \le d(f, f^{s(j)})   ...(16)

d(f, f^{s(N)}) \le d(f, f^{x})   ...(17)

(Here, x \notin S)

Then, the following equation 18 can be used:

S = \{ s(i) \}   ...(18)
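
One way to read the conditions above is that each result set holds the stored
images closest to the query, ordered by increasing distance. The sketch below
builds the texture and color sets for a single query image by sorting and
keeping the first N/2 entries, which satisfies the ordering and cut-off
conditions by construction. The Euclidean distance, the data layout, and the
function names are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Assumed distance measure between two characteristic vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_by_feature(query_vector, database, feature, k):
    """Names of the k stored images closest to the query for one feature."""
    ranked = sorted(
        database,
        key=lambda name: euclidean(query_vector, database[name][feature]),
    )
    return ranked[:k]


def single_query_sets(query, database, n):
    """Build the texture set and the color set, each holding n // 2 images."""
    half = n // 2
    s_texture = nearest_by_feature(query["texture"], database, "texture", half)
    s_color = nearest_by_feature(query["color"], database, "color", half)
    return s_texture, s_color
```

For a plurality of query images, the same sort-and-truncate step would be
applied once with the combined distance of equation 7 and with N entries kept
instead of N/2.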

Next, the display image selecting unit 404 selects a predetermined
number of images from among the retrieved images that are similar in at least
one of color characteristics, texture characteristics, and shape, and the
image display unit 406 displays the selected images to the user in step 512.
Here, it is preferable that the scope of retrieval is limited to the category
of the query image and the neighboring categories.

When the database is built according to the database building method
for multimedia contents according to the second embodiment of the present
invention explained with reference to FIG. 4, it is preferable that the scope
of retrieval is limited to the URL of the query image and neighboring URLs.
The image to be retrieved can be the original image or a thumbnail image
obtained by decreasing the resolution of the original image. When the
original image is used, retrieval is more accurate, but, depending on the
amount of data and the system performance, retrieval time can increase. When
the thumbnail image is used, accuracy is lower but retrieval time can be
shortened. Therefore, the database can be managed appropriately.

In response to the user's input, the user interface 410 selects one or
more images that the user, on viewing the displayed images, judges to be
similar to the wanted image, and provides information on those images. In
this way, the image retrieval unit 408 again receives from the user
information on the images determined to be visually similar to the query
image. The images received in this way are regarded as candidate query
images. Next, the image retrieval unit 408 again retrieves from the image
database 422 those images that are similar to the query image in at least one
of color characteristics, texture characteristics, and shape. That is, it is
determined in step 514 whether or not the wanted image has been retrieved,
and when the wanted image has not been retrieved, steps 508 through 512 are
performed repeatedly. Here, it is preferable that the scope of retrieval is
limited to the category of the query image and neighboring categories.
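
The repetition of steps 508 through 512 until the check of step 514 succeeds
can be sketched as a simple loop. The callables retrieve and ask_user, the
max_rounds limit, and the return convention are hypothetical and are
introduced only for this illustration.

```python
def feedback_search(initial_queries, retrieve, ask_user, max_rounds=10):
    """Repeat retrieval with user-selected images until the user is satisfied.

    retrieve(queries) stands for steps 508 through 512 and returns the images
    shown to the user. ask_user(shown) stands for step 514 and returns a pair
    (found, picked), where picked lists the images the user judged similar.
    """
    queries = list(initial_queries)
    for _ in range(max_rounds):  # max_rounds is an assumed safety limit
        shown = retrieve(queries)
        found, picked = ask_user(shown)
        if found:
            return shown
        if picked:
            # The picked images become the candidate query images.
            queries = list(picked)
    return None
```

When the user picks nothing in a round, the sketch keeps the previous query
images and tries again, which is one possible design choice among several.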

The multimedia contents retrieval method enables fast retrieval of 
wanted images in the database collectively storing multimedia contents. 

The database building method for multimedia contents and the
retrieval method can be written as a program operating in a personal computer
or a server-class computer. The program codes and code segments forming
the program can be easily derived by computer programmers in the field. The
program can be stored in a computer readable recording medium. The
recording medium includes a magnetic recording medium, an optical
recording medium, and a radio wave medium.

As described above, using category information on the corresponding
sites, the database building method for multimedia contents according to the
present invention semantically classifies multimedia contents and stores them
in the corresponding databases. In the database built by the database building
method for multimedia contents according to the present invention,
multimedia contents which are dispersed on the WWW are well collected and,
using category information or URL information, are semantically well
classified. Therefore, various methods for retrieving multimedia contents can
be used so that wanted multimedia contents can be retrieved fast and
efficiently.